Clustering Improves the Exploration of Graph Mining Results

Authors

  • Edgar H. de Graaf
  • Joost N. Kok
  • Walter A. Kosters
Abstract

Frequent subgraph mining starts from a given set of graphs and searches for (connected) subgraphs that are contained in many of them. Each graph can be seen as a transaction or, since the techniques applied in this paper are used in (bio)chemical analysis, as a molecule. In this work we discuss an application that enables the user to further explore the results of a frequent subgraph mining algorithm. Such an algorithm returns the frequent subgraphs, also referred to as fragments, of the graphs in the dataset. In addition to the frequent subgraphs, the algorithm provides a lattice that models sub- and supergraph relations among the fragments, which can be explored with our application. The lattice can also be used to group fragments by means of clustering algorithms, and the user can easily browse from group to group. The application can also display only a selection of groups that occur in almost the same set of molecules or, on the contrary, in different molecules. This allows one to see which patterns cover different or similar parts of the dataset.
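The idea of grouping fragments that occur in almost the same set of molecules can be illustrated with a minimal sketch. It is not the authors' method: the Jaccard distance, the single-link grouping, the threshold `max_distance`, and the toy fragment names and molecule IDs are all assumptions made for illustration only.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Distance between two occurrence sets: 0.0 = identical, 1.0 = disjoint."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def group_fragments(occurrences, max_distance=0.5):
    """Single-link grouping of fragments by occurrence-set similarity.

    occurrences maps a fragment name to the set of molecule IDs that
    contain it. Two fragments land in the same group when a chain of
    pairwise distances <= max_distance connects them (hypothetical
    criterion, chosen only for this sketch).
    """
    parent = {f: f for f in occurrences}

    def find(f):
        # Union-find lookup with path halving.
        while parent[f] != f:
            parent[f] = parent[parent[f]]
            f = parent[f]
        return f

    for f, g in combinations(occurrences, 2):
        if jaccard_distance(occurrences[f], occurrences[g]) <= max_distance:
            parent[find(f)] = find(g)

    groups = {}
    for f in occurrences:
        groups.setdefault(find(f), set()).add(f)
    return sorted(sorted(members) for members in groups.values())


# Toy data: fragment -> IDs of the molecules in which it occurs.
occurrences = {
    "C-C": {1, 2, 3, 4},
    "C-C-O": {1, 2, 3},
    "C-N": {5, 6},
    "C-N-C": {5, 6, 7},
}

if __name__ == "__main__":
    print(group_fragments(occurrences))
```

In this toy input, the two carbon-oxygen fragments share most of their molecules and form one group, while the two nitrogen fragments form another; the groups cover disjoint parts of the dataset, which is the kind of contrast the application is described as displaying.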

Related resources

Document Clustering Based on Ontology and a Fuzzy Approach

Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering, one of the important techniques of text mining, is the unsupervised classification of similar documents into different groups. The most important step...

A clustering approach for mineral potential mapping: A deposit-scale porphyry copper exploration targeting

This work describes a knowledge-guided clustering approach for mineral potential mapping (MPM), in which the optimum number of clusters is derived from a knowledge-driven methodology through a concentration-area (C-A) multifractal analysis. To implement the proposed approach, a case study at the North Narbaghi region in Saveh, Markazi province, Iran, was investigated to discover porphyry ...

Application of Probabilistic Clustering Algorithms to Determine Mineralization Areas in Regional-Scale Exploration Studies

In this work, we aim to identify the mineralization areas for the next exploration phases. To this end, probabilistic clustering algorithms are used to determine the multi-element geochemical anomalies, because they use appropriate measures, can work with datasets that have missing values, and do not get trapped in local optima. Four probabilistic clustering algorithms, namely PHC,...

A Hybrid Data Clustering Algorithm Using Modified Krill Herd Algorithm and K-MEANS

Data clustering is the process of partitioning a set of data objects into meaningful clusters or groups. Because clustering algorithms are used widely in many fields, a lot of research is still going on to find the best and most efficient clustering algorithm. K-means is simple and easy to implement, but it is sensitive to the initialization of the cluster centers and hence can become trapped in a local optimum. In this paper...

A Comparative Study between a Pseudo-Forward Equation (PFE) and Intelligence Methods for the Characterization of the North Sea Reservoir

This paper presents a comparative study between three versions of adaptive neuro-fuzzy inference system (ANFIS) algorithms and a pseudo-forward equation (PFE) to characterize the North Sea reservoir (F3 block) based on seismic data. According to the statistical studies, four attributes (energy, envelope, spectral decomposition and similarity) are known to be useful as fundamental attributes in ...

Publication date: 2007